Molecular evidence for natural hybridization between Rumex crispus and R. obtusifolius (Polygonaceae) in Korea

Interspecific hybridization has been suggested to occur frequently in Rumex (Polygonaceae). Several hypothesized combinations of parental species of hybrids based on their intermediate morphology have been suggested in the genus, but few of them have been phylogenetically tested. We analyzed nuclear and chloroplast DNA sequence data of a putative natural hybrid between Rumex crispus and Rumex obtusifolius from Korea to confirm its hybrid status and to determine the maternal parent. Analysis of the nuclear DNA pgiC region revealed that R. crispus and R. obtusifolius have contributed to the nuclear genome of the putative hybrids. The haplotype distribution pattern inferred from the combined sequence data set of five chloroplast DNA regions (matK, rbcL-accD IGS, trnK-rps16 IGS, ycf6-psbM IGS and psbA-trnH IGS) indicated bidirectional hybridization events between R. crispus and R. obtusifolius. This paper provides the first molecular evidence for interspecific hybridization between R. crispus and R. obtusifolius. In addition, our findings strongly suggested that Korean populations of Rumex japonicus have a hybrid origin, and R. crispus may represent one of the parental taxa.

Based on the combined data set, six cpDNA haplotypes were identified from 70 accessions of R. crispus, R. obtusifolius, R. japonicus and the putative hybrids ( Table 2, Supplementary Table S1). Four haplotypes (C1, www.nature.com/scientificreports/ C3-C5) were recovered from 17 accessions of R. crispus sampled for this study. Among those haplotypes, haplotypes C3-C5 were also detected in accessions from pure, non-mixed populations of R. crispus, but haplotype C1 was restricted to those from mixed populations in the Naesosa area ( Table 2, Supplementary Table S1). In R. crispus, haplotype C5 was detected in one accession and haplotypes C3 and C4 were detected in three and four accessions, respectively. In contrast to R. crispus, a single haplotype (C2) was detected in all accessions of R. obtusifolius. A total of five haplotypes (C1-C5) were recovered from the putative hybrids. In R. japonicus, two haplotypes (C1 and C6) were detected ( Table 2, Supplementary Table S1).
The distribution of cpDNA haplotypes in six populations in the Naesosa (NS1-NS6) area and one population on Gadeok Island (GD1) was examined (Supplementary Table S1). In populations NS1 and NS6, R. crispus contained haplotype C1, whereas the putative hybrids exhibited either haplotype C2 (three accessions) or C5 (five accessions). Rumex japonicus in population NS1 contained haplotype C1. In population NS2, R. crispus possessed either haplotype C1 (three accessions) or C4 (one accession), whereas all seven accessions of the putative hybrids exhibited haplotype C2. Rumex japonicus possessed either haplotype C1 (one accession) or C6 (two accessions). Two accessions of R. crispus sampled from population NS3 possessed haplotype C1. Haplotype C2 occurred in two putative hybrid individuals sampled from the same population. Rumex japonicus in this population exhibited haplotype C6 (Supplementary Table S1).
In population NS4, four haplotypes (C1, C3, C4 and C5) were detected in six accessions of R. crispus (Supplementary Table S1). Those same haplotypes were shared by nine accessions of the putative hybrids sampled from the same population. Three other accessions of the putative hybrids had haplotype C2. In particular, haplotype C3 was not found in any other populations in the Naesosa area, but was detected in accessions of R. crispus and the putative hybrids from the distantly located population on Gadeok Island (GD1). In population NS5, R. crispus possessed haplotype C1, whereas the putative hybrids possessed haplotypes C1 (three accessions), C2 (one accession) or C5 (two accessions). In population NS6, R. crispus possessed haplotype C1, whereas the putative hybrids possessed haplotypes C2 (two accessions) or C5 (two accessions). In population GD1 of Gadeok Island, accessions of both R. crispus and the putative hybrids shared haplotype C3; no putative hybrids shared a cpDNA haplotype with R. obtusifolius (Supplementary Table S1).
Nuclear DNA pgiC. A total of 85 sequences of the nDNA pgiC region was recovered from 57 accessions representing R. crispus, R. obtusifolius, R. japonicus and the putative hybrids (Fig. 3). In addition, we examined the pgiC sequences of nine accessions of R. crispus and six of R. obtusifolius obtained from pure, non-mixed populations of each species (Supplementary Table S1). The length of pgiC was 978 bp in R. crispus, 996 bp in R. obtusifolius, 968-1249 bp in R. japonicus, and 978-996 bp in the putative hybrids. The length of the pgiC data set after alignment was 1387 bp (Table 1). There were 284 (20.5%) variable characters, 132 (9.5%) of which were parsimony informative (Table 1).
All accessions of R. crispus and R. obtusifolius and eight accessions of the putative hybrids had a single pgiC sequence type, but 19 accessions of the putative hybrids had two sequence types. Accessions of R. japonicus had two or three sequence types (Fig. 3). Five pgiC haplotypes (N1-N5) were identified from 57 accessions of R. crispus, R. obtusifolius, R. japonicus, and the putative hybrids (Fig. 3, Supplementary Table S1). All accessions of R. crispus exhibited haplotype N2, whereas those of R. obtusifolius possessed haplotype N3. Four haplotypes (N1, N2, N4 and N5) were detected in R. japonicus. Accessions of the putative hybrids possessed either haplotype N2 or N3 only, or both N2 and N3 pgiC haplotypes (Fig. 3).
Phylogenetic analyses. The maximum parsimony (MP) analysis of the combined cpDNA data set resulted in a single most parsimonious tree with 278 steps (CI = 0.989, RI = 0.994) (Fig. 2). The majority-rule consensus tree obtained from Bayesian inference (BI) analysis of the same data set was basically identical to the MP tree in topology and groupings ( Supplementary Fig. S1). In the cpDNA tree (Fig. 2), none of the species included in this study were resolved as monophyletic. Accessions of R. obtusifolius formed a weakly supported clade (BS = 64, www.nature.com/scientificreports/   (Fig. 2). The results strongly suggested that hybridization has occurred in the populations sampled. The MP analysis of the nDNA pgiC data set resulted in a single most parsimonious tree with 345 steps (CI = 0.939, RI = 0.990) (Fig. 3). The majority-rule consensus tree obtained from BI analysis of the same data set www.nature.com/scientificreports/ was identical to the MP tree in topology and groupings ( Supplementary Fig. S2). In the nDNA tree (Fig. 3 www.nature.com/scientificreports/ all 19 cloned accessions of the putative hybrids recovered two haplotypes, N2 and N3, which were present in all accessions of R. crispus and R. obtusifolius, respectively. Eight directly sequenced accessions of the putative hybrids recovered either haplotype N2 or N3 (Fig. 3).
The statistical parsimony analysis of the combined cpDNA sequences as implemented in TCS resulted in a network of six haplotypes (Fig. 4). Those haplotypes were separated by one to 16 mutations. Fifteen haplotypes inferred by TCS were not found in the analyzed individuals and occurred as missing haplotypes in the network. Four haplotypes (C1, C3, C4, and C5) belonged to R. crispus; all those haplotypes were shared with hybrid accessions. The haplotypes in R. crispus were separated by one (C3 and C4) to 16 (C1 and C5) mutation steps. Only one haplotype (C2) was found in R. obtusifolius, and it was shared with hybrid accessions. Haplotype C6 was found only in R. japonicus.

Discussion
In the present study, we detected putative hybrid individuals with intermediate morphology from mixed populations of R. crispus and R. obtusifolius in seven localities in Korea (Fig. 1). The cloned nDNA pgiC sequences of the 19 accessions of the putative hybrids sampled from those populations revealed two types of sequences in each accession, haplotypes N2 and N3. Haplotypes N2 and N3 were exhibited by R. crispus and R. obtusifolius, respectively. Haplotype N2 differed from haplotype N3 by 54 bp substitutions, and there were no shared pgiC haplotypes between the two species (Fig. 3, Supplementary Table S2). The intermediate morphology and cooccurrence of two divergent pgiC haplotypes corresponding to R. crispus and R. obtusifolius indicated that those individuals were F1 hybrids between them. The putative hybrid accessions shared cpDNA haplotypes with either R. crispus or R. obtusifolius, suggesting that both species served as the maternal parent.
In comparison, eight directly sequenced accessions of the putative hybrids with intermediate morphology recovered either nDNA pgiC haplotype N2 (one accession) or N3 (seven accessions) (Fig. 3). Since these putative hybrid individuals occur only in mixed populations of the presumed parental species, and most of their flowers failed to set fruit, became dry and fell before any appreciable enlargement of the valves, it is highly likely that they www.nature.com/scientificreports/ represent backcross generation of F1 hybrids. Repeated backcrossing to either one of parental species may lead to elimination of one of the two copies initially present in the nuclear genome of F1 hybrids 32 . In particular, six putative hybrid accessions (273, 803, 811, 813, 817, 826) shared nDNA pgiC haplotype N3 with R. obtusifolius, but shared cpDNA haplotypes (C1, C3-C5) with C. crispus (Figs. 2, 3). The findings strongly suggested that backcrossing or introgression has occurred in these populations to at least some extent. Maternal inheritance of chloroplasts, which is typical in angiosperms, has been experimentally demonstrated in the Polygonaceae 18 . If maternal chloroplast inheritance is also assumed to occur in Rumex, then the pattern of haplotype distribution revealed in our study suggests bidirectional hybridization events between R. crispus and R. obtusifolius. In population NS4, for example, nine accessions of the putative hybrids comprised four cpDNA haplotypes (C1, C3-C5), all of which were detected in R. crispus accessions from the same population. On the other hand, the other three accessions of the putative hybrids contained haplotype C2 detected in R. obtusifolius accessions also from the same population (Supplementary Table S1). The results strongly indicated that hybridization between the two species was bidirectional, and both R. crispus and R. obtusifolius served as either paternal or maternal parent. Bidirectional hybridization between R. crispus and R. obtusifolius has been suggested in previous studies 1,2 . Bidirectional hybridization in plants is relatively common and has been reported in several other plant taxa 18,[33][34][35] .
The cpDNA haplotype C3 detected in population NS4 was not recovered from any other populations in the Naesosa area, but was detected in accessions of R. crispus and putative hybrids from the distantly located Gadeok Island population (GD1); populations NS4 and GD1 are separated by over 200 km. It is possible that we were unable to detect this haplotype in other populations in the Naesosa area due to the relatively small sample size. Conversely, the results may suggest the occurrence of long-distance seed dispersal. In Rumex, achenes are dispersed by wind or occasionally through excreta of mammals and birds 36 . Birds occasionally eat seeds of R. crispus when other quality food is not available, and seedlings have been raised from the excreta of various birds 36,37 .
The detection of four cpDNA haplotypes (C1, C3-C5) in only six accessions of R. crispus in population NS4 indicated that R. crispus is highly polymorphic for cpDNA haplotypes (Supplementary Tabel S1). The haplotype network revealed that haplotypes C1 and C5 possessed by the accessions of R. crispus in population NS4 were markedly different from each other and separated by 16 mutation steps (Fig. 4); this strongly suggested that haplotypes C1 and C5 probably diverged long ago. In contrast, all accessions of R. crispus from all populations had identical nuclear haplotypes. Since R. crispus invaded Korea relatively recently, the co-occurrence of highly divergent cpDNA haplotypes in populations of R. crispus may suggest repeated invasions from multiple sources with different genetic backgrounds 10,38 .
Although cpDNA haplotype C1 and nDNA pgiC haplotype N1 were shared by R. japonicus, R. crispus and some putative hybrid accessions, it is unlikely that R. japonicus participated in the formation of those hybrid individuals, as one of the two cpDNA haplotypes and three of the four nDNA pgiC haplotypes exhibited by R. japonicus were unique and not detected in any putative hybrid accessions (Figs. 2, 3, 4). In contrast, all the cpDNA and nDNA pgiC haplotypes of R. crispus were detected in the putative hybrids (Figs. 2, 3, 4). The results strongly suggested that R. crispus, rather than R. japonicus, served as one of the parental species of the putative hybrids.
The cloning of five accessions of R. japonicus recovered multiple pgiC sequence types; two accessions (r130, 409) had three sequence types (haplotypes N2, N5 and N1 or N4), and three (413, 427, 437) had two sequence types (haplotypes N1 or N2 and N5) (Fig. 3). Among the multiple pgiC haplotypes recovered from R. japonicus, haplotype N2 was shared by R. crispus (Fig. 3). Rumex japonicus possessed two cpDNA haplotypes, one of which was shared with R. crispus. Our findings suggested that Korean populations of R. japonicus have a hybrid origin and that R. crispus may represent one of the parental taxa. However, the hybrid status and parentage of the Korean populations of R. japonicus were beyond the scope of this study. Further studies are needed to validate the hybrid origin and parental taxa involved.
Putative hybrid individuals in all populations bore a few fruit containing potentially viable seeds. Molecular data supported the F1 hybrid or backcross nature of these individuals. The fertility of F1 hybrids and backcrosses between R. crispus and R. obtusifolius have been reported previously 1,2,39 . As such, putative hybrids and backcrosses may lead to the formation of hybrid swarms. Hybrid individuals between R. crispus and R. obtusifolius reported from Europe, Australia and North America have been referred to as R. x pratensis Mert. & W. D. J. Koch [1][2][3]8,40 .
Analyses of chloroplast and nuclear DNA sequence data in the present study provided compelling evidence for the occurrence of natural bidirectional hybridization between R. crispus and R. obtusifolius. The hybrids arise frequently in mixed populations of the two parental species in Korea. Confirmation of interspecific hybridization between recently introduced alien species such as R. crispus and R. obtusifolius would be significant because of the possible ecological consequences of hybrids and hybrid swarms. The hybrid populations we sampled were located in disturbed sites including roadsides, abandoned fields, forest clearings, and riparian zones. Thus, it appears that habitat disturbance may favor the formation of interspecific hybrids in Rumex. Frequent hybridization between species of Rumex in disturbed habitats has been reported previously 1,2,4 . The possible existence of F2 progeny and backcross generations in Rumex have been reported in previous studies 2,4 . The fertility of backcross generations usually increases as compared to that of the F1 generation in Rumex and in several other angiosperms, including Trifolium L. and Helianthus L. 2,41,42 The increased fertility of backcross generations in hybrid swarms may result in the loss of genetic integrity of the parental species. In addition, it is possible that F1 hybrids and backcross plants are more invasive than their parental species due to hybrid vigor. Consequently, those plants may negatively impact native biodiversity, especially native plants of grasslands and forest margins, by occupying similar habitats and competing for resources. We recommend further study to understand the ecological consequences of the hybrids and to clarify whether the genetic integrity of the parental species is maintained. In addition, further studies are needed to validate the hybrid origin of the Korean populations of R. japonicus.  Table S1). We also examined nine accessions of R. crispus and six of R. obtusifolius obtained from pure, non-mixed populations of each species to validate species-specific haplotypes of those species (Supplementary  Table S1). Because it occurs in some mixed populations of R. crispus and R. obtusifolius, we included R. japonicus in our study to evaluate its potential contribution to the nuclear and/or chloroplast genome of the putative hybrids. The individuals were collected in July 2020 and June-July 2021. Naesosa (35° 37′ N 126° 35′ E) refers to the site of an ancient Buddhist temple in Byeonsanbando National Park on the west coast of South Korea (Fig. 1).

Scientific Reports
It is located at the base of a mountain at an altitude of 30 m above sea level. The study sites in Naesosa were disturbed areas, including roadsides, abandoned agricultural fields, forest clearings and riparian zones. Gadeok (35° 02′ N 128° 49′ E) is an island in Busan, South Korea (Fig. 1). The sampled population on the island was on a roadside at the base of a mountain, at an altitude of about 40 m above sea level. Permission for collecting plant samples from the Naesosa area was obtained from Byeonsanbando National Park, Korea. The collection locations of the other plant samples were neither in protected areas nor on private land, and thus no permission was required for collections from these locations. Rumex acetosa L. (subgenus Acetosa) and R. acetosella L. (subgenus Acetosella) were selected as outgroups. Initial identifications of the individuals collected from the above populations were carried out by the authors based on differences in the major morphological characters of Rumex. Rumex crispus, R. obtusifolius and R. japonicus are distinguished mainly on the basis of differences in the shape of the leaves and the valves of their fruit. The basal and lower cauline leaves of R. crispus are lanceolate or oblong-lanceolate with strongly crisped margins. The valves are broadly ovate with entire margins. In contrast, the basal and lower cauline leaves of R. obtusifolius are usually ovate to elliptic or ovate-oblong with entire margins, and the valves are triangular-ovoid with spinose margins. Rumex japonicus has lanceolate to oblanceolate leaves somewhat similar to those of R. crispus, but it can be distinguished from the other two species by its reniform or broadly pentagonal valves with irregularly and shallowly toothed margins. Hybrid individuals of R. crispus and R. obtusifolius are usually intermediate in leaf shape between the two parental species. Most flowers of hybrid individuals failed to set fruit; they became dry and fell before any appreciable enlargement of the valves. A few flowers appeared to be fertile and set fruit, with the valves enlarging to varying degrees. All voucher specimens were deposited in Seoul National University Herbarium (SNU) with Herbarium IDs: 119,355-119,426. All experiments and methods were performed in accordance with relevant regulations and guidelines. DNA extraction, amplification, and sequencing. Total genomic DNA was extracted from leaf samples, either fresh or dried with silica gel, using the DNeasy Plant Mini Kit (QIAGEN, Germany). Five cpDNA regions (matK, rbcL-accD IGS, trnK-rps16 IGS, ycf6-psbM IGS and psbA-trnH IGS) and nDNA pgiC were amplified by polymerase chain reaction (PCR). Amplifications were carried out using a Veriti 96-Well Thermal Cycler (Applied Biosystems, USA) in 25 μL total volume containing 10-30 ng of genomic DNA, 1.25 U of EF-Taq DNA Polymerase (SolGent Co., Korea), 10 × EF-Taq Reaction Buffer with 25 mM MgCl 2 , 0.5 mM of each dNTP, 5% DMSO, and 0.4 μM of each primer. PCR and sequencing primers and PCR cycling conditions used in this study are provided in Table 3. The PCR products were purified using the enzymatic purification method 43 . Purified PCR products were sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Table 3. PCR and sequencing primers and PCR cycling conditions for cpDNA matK, rbcL-accD IGS, trnK-rps16 IGS, ycf6-psbM IGS and psbA-trnH IGS, and nDNA pgiC. Superscripts following primer names denote reference numbers. Primers pgiC_16F (5'-CAC AGC TTT TAC CAA CTG ATT C-3') and pgiC_21R (5'-CCT AAC TCA ACT CCC CAC-3') were designed by the authors. www.nature.com/scientificreports/ USA) following the manufacturer's instructions. The sequence products were purified by ethanol precipitation and then run on an ABI 3730xl DNA Analyzer (Applied Biosystems, USA) at Macrogen Inc., Korea.
Cloning of nDNA pgiC. Due to the presence of polymorphic nucleotide positions in some direct sequences, PCR products of the nDNA pgiC region for 24 accessions of putative hybrids and R. japonicus were cloned using the pGEM-T Easy Vector System I (Promega, USA) following the manufacturer's instructions. Transformed E. coli cells were spread onto LB agar plates with ampicillin (50 µg/mL), and incubated at 37 °C for 12-16 h. Ten to 15 colonies per plate were randomly selected, and colonies were then screened for inserts by PCR. Temperature and cycling conditions consisted of lysis of cells and initial denaturation for 3 min at 95 °C, followed by 35 cycles of 1 min denaturation at 95 °C, 30 s annealing at 50 °C, and 1 min 15 s elongation at 72 °C, with a 7 min final extension at 72 °C. The PCR products were purified and sequenced using the same procedure described above.
Sequence alignment and analyses. Nucleotide sequences were assembled and edited using Sequencher 5.4.6 (Gene Codes Corporation, USA). Edited sequences were aligned with Clustal X v. 2.0 with final manual adjustment using Se-Al v. 2.0a11 49,50 . All DNA sequences obtained in this study were deposited in GenBank (Supplementary Table S1). We identified a short inversion of 10 bp in the aligned sequence data set of ycf6-psbM IGS region. The inversion was replaced with the reverse complement of its sequence. As the inversion was assumed to be a single mutation event and appeared to have phylogenetic information, it was subsequently coded as a single binary character following the procedure suggested by Whitlock et al. 51 In addition, six indels ranging from 1 to 9 bp in length in the cpDNA sequence data set and one 271-bp indel in the nDNA pgiC sequence data set were coded and added to the data matrices as extra binary characters 52 . Phylogenetic analyses were performed on the combined cpDNA sequence data sets using maximum parsimony (MP) and Bayesian inference (BI). MP analyses were performed in PAUP*4.0a169 53 using a heuristic search strategy with 100 random sequence additions, tree bisection-reconnection (TBR) branch swapping, ACC TRA N, MULTREES on, MAXTREE set to no limit, and HOLD = 10 in effect. All characters were treated as unordered and equally weighted, and gaps were treated as missing data except for seven indels and one inversion coded as binary characters. Bootstrap (BS) analyses of 1000 replicates were conducted in PAUP* to evaluate support for clades using the same search parameters as in the MP analyses above 54 . For BI analyses, the optimal model of sequence evolution for each data set was identified using the Akaike information criterion (AIC) in MrModeltest 2.3 55 . The following models of sequence evolution were identified as optimal for the five cpDNA regions examined in this study; GTR + I for matK, HKY for rbcL-accD IGS, GTR + G for trnK-rps16 IGS and the combined cpDNA data set, GTR for ycf6-psbM IGS, F81 + I for psbA-trnH IGS. For nDNA pgiC region, the optimal model of sequence evolution was HKY + I. (Table 1). BI analyses were performed in MrBayes 3.2 56 using two independent runs of four chains (three heated and one cold) for one million generations. Trees were sampled every 1000 generations, and the first 25% were discarded as burn-in. The remaining trees were used to produce a 50% majority-rule consensus tree and determine posterior probabilities (PP). An unrooted haplotype network was constructed for the cpDNA data set using the parsimony method as implemented in the TCS 1.21 program 57 . Gaps were treated as single evolutionary events and indels as the fifth state of character. The 95% probability limit of parsimonious connections was applied to produce the network.